Abstract

3D point clouds of natural environments relevant to problems 
in geomorphology (rivers, coastal environments, cliffs,...) of- 
ten require classification of the data into elementary relevant 
classes. A typical example is the separation of riparian veg- 
etation from ground in fluvial environments, the distinction 
between fresh surfaces and rockfall in cliff environments, or 
more generally the classification of surfaces according to their 
morphology (e.g. the presence of bedforms or by grain size). 
Natural surfaces are heterogeneous and their distinctive prop- 
erties are seldom defined at a unique scale, prompting the use 
of multi-scale criteria to achieve a high degree of classifica- 
tion success. We have thus defined a multi-scale measure of 
the point cloud dimensionality around each point. The di- 
mensionality characterizes the local 3D organization of the 
point cloud within spheres centered on the measured points 
and varies from 1D (points set along a line), 2D (points
forming a plane) to the full 3D volume. By varying the diam- 
eter of the sphere, we can thus monitor how the local cloud 
geometry behaves across scales. We present the technique 
and illustrate its efficiency in separating riparian vegetation 
from ground and classifying a mountain stream as vegetation, 
rock, gravel or water surface. In these two cases, separating 
the vegetation from ground or other classes achieves an accuracy
larger than 98 %. Comparison with a single scale approach 
shows the superiority of the multi-scale analysis in enhancing 
class separability and spatial resolution of the classification. 
Scenes between ten and one hundred million points can be 
classified on a common laptop in a reasonable time. The tech- 
nique is robust to missing data, shadow zones and changes in 
point density within the scene. The classification is fast and 
accurate and can account for some degree of intra-class mor- 
phological variability such as different vegetation types. A 
probabilistic confidence in the classification result is given at 
each point, allowing the user to remove the points for which 
the classification is uncertain. The process can be either fully
automated (minimal user input once, all scenes treated in
large computation batches) or fully customized by the
user, including a graphical definition of the classifiers if so
desired. Working classifiers can be exchanged between users
independently of the instrument used to acquire the data,
avoiding the need to go through full training of the classifier.
Although developed for fully 3D data, the method can be 
readily applied to 2.5D airborne lidar data. 



1 Introduction 

Terrestrial laser scanning (TLS) is now frequently used 
in earth sciences studies to achieve greater precision and 
completeness in surveying natural environments than 
what was feasible a few years ago. Having an almost 
complete and precise documentation of natural surfaces 
has opened up several new scientific applications. These 
include the detailed analysis of geometric properties of 
natural surfaces over a wide range of scales (from a 
few cm to km), such as 3D stratigraphic reconstruction
and outcrop analysis [22] [55], grain size distribution
in rivers [T71 [THl [T5], dune fields [3TJ [3D], vegetation
hydraulic roughness [3 [Ij, channel bed dynamics [23] and
in situ monitoring of cliff erosion and rockfall
characteristics [Tl [^ [Ml [551 [15]. For all these applications, precise
automated classification procedures that can pre-process
complex 3D point clouds in a variety of natural
environments are highly desirable. Typical examples of
applications are the separation of vegetation from ground or
cliff outcrops, the distinction between fresh rock surfaces 
and rockfall, the classification of flat or rippled bed and 
more generally the classification of surfaces according to 
their morphology. Yet, developing such procedures in 
the context of geomorphologic applications remains dif- 
ficult for four reasons : (1) the 3D nature of the data as 
opposed to the traditional 2D structures of digital eleva- 
tion models (DEM), (2) the variable degree of resolution 
and completeness of the data due to inevitable shadow- 
ing effects, (3) the natural heterogeneity and complexity 
of natural surfaces, and (4) the large amount of data 
that is now generated by modern TLS. In the following 
we describe these difficulties and how efficient 3D clas- 
sification is critically needed to advance our use of TLS 
data in natural environments. 

1. Terrestrial lidar data are mostly 3D as opposed 
to digital elevation models or airborne lidar data which 
can be considered 2.5D. This means that traditional data 
analysis methods based on raster formats (in particular 
the separation of vegetation from ground, e.g. [42]) or
2D vector data processing cannot in general be applied to 
ground based lidar data. In some cases, the studied area 
in the 3D point cloud is mostly 2D at the scale of interest 



Fig. 1 Left: Steep mountain river bed in the Otira gorge (New Zealand), and Terrestrial Laser Scanner location.
Right: part of the point cloud rendered using the PCV technique in CloudCompare fT^, showing the full 3D nature of
the scene (3 million points, minimum point spacing = 1 cm). Identifying key elementary classes such as vegetation,
rock surfaces, gravels or water surfaces would make it possible to study the vertical distribution of vegetation and the water surface
profile, to segment large boulders, or to measure changes in gravel cover and thickness between surveys.




(e.g., river bed [T7], cliff [36] [2], estuaries [T^) and can be
projected and gridded to use existing fast raster based 
methods. However in many cases the natural surface is 
3D and there is no simple way to turn it into a 2D surface 
(e.g., Fig. 1). In other cases rasterizing a large scale
2D surface becomes non-trivial when sub-pixel features 
(vegetation, gravel, fractures...) are significantly 3D. In a 
river bed for instance, this amounts to locally classifying
the data in terms of bed surface and over-bed features 
(typically vegetation) which requires a 3D classification 
approach. 

2. Terrestrial lidar datasets are all prone, to a variable
degree, to shadow effects and missing data (water surface
for instance) inherent to the ground based location of
the sensor and the roughness characteristics of natural
surfaces (e.g. [ij). While multiple scanning positions can
significantly reduce this issue, it is sometimes not feasible 
in the field due to limited access or time. Interpolation 
can be used to fill in missing information (e.g., meshing 
the surface), but it is quite complicated in 3D, and can 
lead to spurious results owing to the high geometrical 
complexity of natural surfaces. Arguably, interpolation 
should be used as a last resort, and in particular only 
after the 3D scene has been correctly classified to remove, 
for instance, vegetation. Hence, any method to classify 
3D point clouds should account for shadow effects, either 
by being insensitive to them, or by factoring in that data are
locally missing. 

3. As shown in a scan of a steep mountain stream, 
natural surfaces can exhibit complex geometries (fig. 1).
This complexity arises from the non-uniformity of indi- 
vidual objects (variable grain size, type and age of veg- 
etation, variable lithology and fracture density ...), the 
large range of characteristic spatial scales (from sand to 
boulders, grass to trees) or its absence (fractures for
instance). This makes the classification of raw 3D point
cloud data arguably more complex than that of artificial
structures such as roads or buildings, which have simpler
geometrical characteristics (e.g., plane surfaces or sharp
angles).

4. As technology evolves, data sets are denser and 
larger which means that projects with billions of points 
are likely to become common in the next decade. Au- 
tomatic processing is thus urgently needed, together 
with fast and precise methods minimizing user input for 
rapidly classifying large 3D point clouds.

To our knowledge no technique has been proposed to
classify natural 3D scenes as complex as the one in fig. 1
into elementary categories such as vegetation, rock
surface, gravels and water. Classification of simpler
environments into flat surfaces and vegetation has been studied
for ground robot navigation [45] E5] using purely
geometrical methods, but was limited by the difficulty in
choosing a specific spatial scale at which natural
geometrical features must be analyzed. Classification based on
the reflected laser intensity has recently been proposed
[11], but owing to the difficulty in correcting precisely
for distance and incidence effects (e.g. [19] 1^), it
cannot yet be applied to 3D surfaces. Classification based
on RGB imagery can be used in simple configurations to
separate vegetation from ground for instance [24]. But
for large complex 3D environments, the classification
efficiency is limited by strong shadow projections (fig. 1),
image exposure variations, effects of surface humidity as
well as the limited separability of the spectral signatures of
RGB components [24]. Moreover, when the objective is
to classify objects of similar RGB characteristics but
different geometrical characteristics (e.g. flat bed vs ripples,
fresh bedrock vs rockfall), only geometry can be used to
separate points belonging to each class.

In this paper, we present a new classification method
for 3D point clouds specifically tailored for complex
natural environments. It overcomes most of the difficulties
discussed above: it is truly 3D, works directly on point
clouds, is largely insensitive to shadow effects or changes
in point density, and most importantly it allows some
degree of variability and heterogeneity in the class
characteristics. The software suite designed for this task
(the CANUPO suite) is coded to handle large point cloud
datasets. This tool can be used by non-specialists
in machine learning, either in an automated way or
with easy control of the classification process.
Because geometrical measurements are independent of
the instrument used (which is not the case for reflected
intensity [19] or RGB data), classifiers defined in a given
setting (e.g. mountain rivers, salt marsh environment,
gravel bed river, cliff outcrop...) can be directly reused
by other users and with different instruments without a
mandatory phase of classifier reconstruction.

The strength of our method is to propose a reliable 
classification of the scene elements based uniquely on 
their 3D geometrical properties across multiple scales. 
This allows for example recognition of the vegetation on
complex scenes with very high accuracy (i.e. ≈ 99.6 %
in a context such as fig. 1). We first present the study
sites and data acquisition procedure. We then introduce
the new multi-scale dimensionality feature that is used
to describe the local geometry of a point in the scene and
how it can characterize simple elementary environment
features (ground and vegetation). In section 4, we
describe the training approach to construct a classifier: it
aims at automatically finding the combination of scales
that best allows the distinction between two or more
features. The quality of the classification method is tested
on two data sets: a simple case of riparian vegetation
above sand, and a more complex, multiple class case of a
mountain river with very pronounced heterogeneity and
3D features (fig. 1). Finally, we discuss the limitations
and range of application of this method with respect to
other classification methods.



2 Study sites and data acquisition 

The method is tested on two different environments:
a pioneer salt marsh environment in the Bay of Mont
Saint-Michel (France) scanned at low tide, consisting of
riparian vegetation 10 to 30 cm high above a sandy
ground either flat or with ripples of a few cm height (fig.
4 and IgI); and a steep section of the Otira River gorge
(South Island of New Zealand) consisting of bedrock
banks partially covered by vegetation and an alluvial bed
composed of gravels and blocks of centimeter to meter
size (fig. 1). Both scenes were scanned using a Leica
Scanstation 2 mounted on a survey tripod at 2 m above
ground in the pioneer riparian vegetation or on the bank
as in figure 1 for the Otira River. The Leica Scanstation 2
is a single echo time-of-flight lidar using a green
laser (532 nm) with a practical range on natural sur- 
faces varying from 100 to 200 m depending on surface 
reflectivity. When the laser incidence is normal to an 
immobile water surface, the laser can penetrate up to 
30 cm in clear water and return an echo from the channel
bed. This was the case in some part of the Otira
Gorge scene. However, on turbulent white water, the
laser is directly reflected from the surface or penetrates
partially the water column [2H]. Hence, the water surface
becomes visible as a highly uncorrelated noisy surface (fig.
1). Accuracies quoted by the manufacturer, given as one
standard deviation at 50 m, are 4 mm for range measurement
and 60 µrad for angular accuracy. Repeatability
of the measurement at 50 m was measured at 1.4 mm
on our scanner (given as one standard deviation). Laser
footprint is quoted at 4 mm between 1 and 50 m. This
narrow footprint allows the laser to hit ground or cliff
points in relatively sparse vegetation. But this also generates
a small proportion of spurious points called mixed
points (e.g. [17], [25]) at the edges of objects (gravels,
stems, leaves...). The impact of these spurious points
on the classification procedure is addressed in the
discussion section.

Point clouds used for the tests were acquired from a 
single scan position as it corresponds to the worst case 
scenario with respect to shadow effects and change in 
point density. In the Otira River, the horizontal and 
vertical angular resolutions were (0.031°, 0.019°) with a
range of distance from the scanner from 15 to 45 m. This 
corresponds to point spacing ranging from 5 to 24 mm. 
To speed up calculation during the classification tests, 
the data were sub-sampled with a minimum point dis- 
tance of 10 mm leaving 1.17 million points in the scene. 
Parameters for the riparian vegetation environment were 
(0.05°, 0.014°) for the angular resolution and a distance 
of 10 to 15 m from the scanner. This corresponds to 
point spacing varying from 2.4 mm to 13 mm for about 
640000 points in the dataset used for classification tests. 
No further treatment was applied to the data. 
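
The point spacings quoted above can be checked with a short calculation. The sketch below assumes a small-angle approximation (spacing = range × angular step in radians) on a surface facing the scanner, and assumes that the smallest spacing pairs the finest angular step with the shortest range; the function name `point_spacing_mm` is ours, for illustration only.

```python
import math

def point_spacing_mm(range_m: float, angular_step_deg: float) -> float:
    """Arc length in mm between two adjacent laser shots at a given range.

    Assumes the surface is normal to the beam (small-angle approximation).
    """
    return range_m * math.radians(angular_step_deg) * 1000.0

# Otira River: angular steps (0.031 deg, 0.019 deg), ranges from 15 to 45 m.
otira_min = point_spacing_mm(15.0, 0.019)   # about 5 mm
otira_max = point_spacing_mm(45.0, 0.031)   # about 24 mm

# Salt marsh: angular steps (0.05 deg, 0.014 deg), ranges from 10 to 15 m.
marsh_min = point_spacing_mm(10.0, 0.014)   # about 2.4 mm
marsh_max = point_spacing_mm(15.0, 0.05)    # about 13 mm

print(f"Otira: {otira_min:.1f} to {otira_max:.1f} mm")
print(f"Marsh: {marsh_min:.1f} to {marsh_max:.1f} mm")
```

Under these assumptions the computed ranges agree with the spacings reported for both scenes.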

3 Multi-scale local dimensionality feature 

The main idea behind this feature is to characterize the 
local dimensionality properties of the scene at each point 
and at different scales. By "local dimensionality" we
mean here what the cloud looks like geometrically at a
given location and a given scale: whether it is more like
a line (1D), a plane surface (2D), or whether points are
distributed in the whole volume around the considered
location (3D). For instance, consider a scene comprising
a rock surface, gravels, and vegetation (e.g. fig. 1): at a
few centimeter scale the bedrock looks like a 2D surface,
the gravels look 3D, and the vegetation is a mixture of
elements like stems (1D) and leaves (2D). At a larger scale
(e.g. 30 cm) the bedrock still looks mostly 2D, the gravels
now look more 2D than 3D, and the vegetation has become
a 3D bush (see fig. 2). When combining information
from different scales we can thus build signatures that 
identify some categories of objects in the scene. Within 
the context of this classification method, the signatures 
are defined automatically during the training phase in 
order to optimize the separability of categories. This 
training procedure is described in section 4. 



Fig. 2 Neighborhood ball at different scales. In this
representation, outside points (gray stars) can be on the
side but also behind the neighborhood ball.
The cloud has a different aspect at
each scale (here 1D, then 2D, then 3D).



Various ways already exist to characterize the
dimensionality at different scales and to represent
multiscale relations, for example the fractal dimension [5]
and the multifractal analysis [47]. However these are
not satisfactory for our needs. The fractal dimension is a
single value that synthesizes the local space-filling
properties of the point cloud over several scales. It does not
match the intuitive idea presented above in which we
aim at a signature of how the cloud dimension evolves
over multiple scales. The multifractal analysis synthesizes
in a spectrum how the statistical moments of a signal, defined
at each scale, relate to each other using exponential fits
(see [47] for more precise definitions; we only give the
main idea here as this is not the main topic of this
article). Unfortunately the multifractal spectrum does not
offer a discriminative power at any given scale, almost
by definition (i.e. it uses fits over multiple scales). Our
goal is to have features defined at each scale and then
use a training procedure to define which combination
of scales best separates two or more categories
(such as ground or vegetation). Some degree of
classification is likely possible using the aforementioned fractal
analysis tools, but our new technique is more intuitive
and arguably better suited for the natural scenes we
consider. In the following we describe how the multi-scale
dimensionality feature is defined using the example of the
simple pioneer salt marsh environment in which only 2
classes exist: riparian vegetation (forming individual
patches) and ground (fine sand). More complex 3D
multiclass cases (as in fig. 1) are addressed in section



3.1 Local dimensionality at a given scale 

Let $C = \{P_i = (x_i, y_i, z_i)\}_{i = 1 \ldots N}$ be a 3D point cloud.
The scale $s$ is here defined as the diameter of a ball
centered on a point of interest, as shown in Fig. 2. For each
point in the scene the neighborhood ball is computed at
each scale of interest, and a Principal Component Analysis
(PCA) [40] is performed on the re-centered Cartesian
coordinates of the points in that ball.

Let $\lambda_i$, $i = 1 \ldots 3$, be the eigenvalues resulting from the
PCA, ordered by decreasing magnitude: $\lambda_1 \geq \lambda_2 \geq \lambda_3$.
The proportion of variance explained by each eigenvalue
is $p_i = \lambda_i / (\lambda_1 + \lambda_2 + \lambda_3)$. Fig. 3 shows the domain of all possible
proportions.

When only a single eigenvalue $\lambda_1$ accounts for the
total variance in the neighborhood ball the points are



Fig. 3 Representing the eigenvalue proportions for the
local neighborhood PCA in a triangle. Panel label: domain of
possible proportions of eigenvalues $p_1, p_2, p_3$ for a 3D point
cloud PCA, with $p_i = \lambda_i / (\lambda_1 + \lambda_2 + \lambda_3)$.



distributed only in one dimension around the reference
scene point. When two eigenvalues are necessary to
account for the variance but the third one does not
contribute, the cloud is locally mostly two-dimensional.
Similarly a fully 3D cloud is one where all three eigenvalues
have the same magnitude. The proportions of eigenvalues
thus define a measure of how much 1D, 2D or 3D
the cloud appears locally at a given scale (see Figs. 2
and 3). Specifying these proportions is equivalent to
placing a point X within the triangle domain in Fig. 3,
which can be done using barycentric coordinates
independently of the triangle shape. Given the constraint
$p_1 + p_2 + p_3 = 1$, a two-parameter feature for quantifying
how 1D/2D/3D the cloud appears can be defined at any
given point and scale.
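
The single-scale measure described above can be sketched as follows. This is a minimal illustration with NumPy, not the CANUPO implementation: the function names, the brute-force neighbor search and the 3-point cutoff are our assumptions. The barycentric weights use the triangle corners implied by the ordering of the eigenvalues, namely (1, 0, 0) for 1D, (1/2, 1/2, 0) for 2D and (1/3, 1/3, 1/3) for 3D; any two of the three weights give a two-parameter feature.

```python
import numpy as np

def eigen_proportions(cloud, center, scale):
    """Proportions (p1, p2, p3) of PCA variance in the ball of diameter `scale`.

    Returns None when the ball holds fewer than 3 points (an assumed cutoff).
    """
    cloud = np.asarray(cloud, dtype=float)
    center = np.asarray(center, dtype=float)
    in_ball = np.linalg.norm(cloud - center, axis=1) <= scale / 2.0
    neighbors = cloud[in_ball]
    if len(neighbors) < 3:
        return None
    centered = neighbors - neighbors.mean(axis=0)   # re-centered coordinates
    covariance = centered.T @ centered / len(neighbors)
    # eigvalsh returns ascending eigenvalues: reverse, then clip round-off negatives.
    eigenvalues = np.clip(np.linalg.eigvalsh(covariance)[::-1], 0.0, None)
    total = eigenvalues.sum()
    if total == 0.0:
        return None
    return eigenvalues / total

def dimensionality_weights(p):
    """Barycentric weights (1D, 2D, 3D) of the proportions p; they sum to 1."""
    p1, p2, p3 = p
    return np.array([p1 - p2, 2.0 * (p2 - p3), 3.0 * p3])

# Synthetic checks: points on a line and points on a flat grid.
line = [(t, 0.0, 0.0) for t in np.linspace(-1.0, 1.0, 21)]
plane = [(x, y, 0.0) for x in np.linspace(-1.0, 1.0, 11)
         for y in np.linspace(-1.0, 1.0, 11)]
w_line = dimensionality_weights(eigen_proportions(line, (0.0, 0.0, 0.0), 2.0))
w_plane = dimensionality_weights(eigen_proportions(plane, (0.0, 0.0, 0.0), 4.0))
```

On these synthetic inputs the line maps to the 1D corner and the flat grid to the 2D corner of the triangle.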

A related measure has been previously introduced for
natural terrain analysis in the context of ground robot
navigation [45] [23] and urban lidar classification [9]. In
these applications, the eigenvalues of the PCA are used
only as ratios that are compared to three thresholds in
order to define the feature space. In the present study we
not only consider the full triangle of all possible
eigenvalue proportions, as shown in Fig. 3, but also span the
feature over multiple scales. The "tensor voting" technique
from computer vision research predates our work in its
use of eigenvalues to quantify the dimensionality of the
lidar data cloud [38] [5D], although with a different
algorithmic approach. Our work is to our best knowledge
the first to combine the local dimensionality
characterization over multiple scales.^ We chose PCA as it is
a simple and standard tool for finding relevant
directions in the neighborhood ball \W]. Other projection
techniques (e.g. non-linear) could certainly be used for
defining different descriptors of the neighborhood ball
geometry, but our results below show that PCA is good
enough already.



^ We thank the editor for these references and remarks on our 
work. 



3.2 Multiple scales feature 

The treatment described in the previous section is
repeated at each scale of interest (see Fig. 2). Given $N_s$
scales, we thus get for each point in the scene a feature
vector with $2 N_s$ values. This vector describes the local
dimensionality characteristics of the cloud around that
point at multiple scales. In the context of ground based
lidar data there may be missing scales, especially the
smallest ones, because of reduced point density, nearby
shadows or scene boundaries. In that case the geometric
properties of the closest available larger scale are
propagated to the missing one in order to complete the $2 N_s$
values. Fig. 4 shows an example of how a scene appears
using this representation for 4 scales.
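
The propagation of missing scales can be sketched as below. This is an illustration under our own assumptions: per-scale features are given as (x, y) pairs ordered by increasing scale, a missing scale is marked with `None`, and the function returns `None` when the largest scales are missing since no larger scale is then available to copy from. The name `multiscale_vector` is hypothetical.

```python
def multiscale_vector(per_scale):
    """Flatten per-scale (x, y) features into one vector of 2 * Ns values.

    `per_scale` is ordered by increasing scale; None marks a missing scale,
    which is filled with the closest available larger scale.
    """
    filled = list(per_scale)
    next_larger = None
    # Walk from the largest scale down so each gap sees the nearest larger scale.
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_larger
        else:
            next_larger = filled[i]
    if any(pair is None for pair in filled):
        return None  # assumption: nothing to propagate from, point is skipped
    return [value for pair in filled for value in pair]

# Four scales, with the smallest and the third one missing.
example = multiscale_vector([None, (0.2, 0.3), None, (0.5, 0.1)])
print(example)
```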

Fig. 4 Density plots of a scene represented in the
proposed feature space at different scales.
Top: excerpt from a point cloud acquired in the Mont
Saint-Michel bay salt marshes (France), in a zone of pioneer
riparian vegetation and sand (point spacing from 2.3 to 14
mm). Bottom (with color available online): dimensionality
density diagrams for one vegetation patch (blue, appearing
as dark gray when printed as gray), a patch of ground (red,
appearing as dark gray in the bottom right 2D region of the
triangles), and all other points of the scene (light gray). Each
triangle is a linearly transformed version of the space in Fig.
3 at the indicated scale. Each corner thus represents the
tendency of the cloud to be respectively 1D, 2D, or 3D.
Recoverable panel titles: scale = 15 cm, scale = 20 cm.

Note how a single patch of vegetation (in blue in Fig.
4) defines a changing pattern at different scales, but
remains separated from the ground (in red), hinting at a
classification possibility. However the rest of the scene
(unlabeled, gray points) is spread through the whole
triangle at each scale: there is no clear cut between
vegetation and ground at any given scale. The solution is
brought by considering the multiscale vector in its
entirety, as a high-dimensional description, and not as a
succession of 2D spaces. This is described in the next
section.

4 Classification 

The general idea behind the classification procedure is 
to define the best combination of scales at which the 
dimensionality is measured, that allows the maximum 
separability of two or more categories. Practically, the 
user could have an intuitive sense of the range of scales at 
which the categories will be the most geometrically dif- 
ferent, but in many cases, because of natural variability 
in shape and size of objects, this is not a trivial exer- 
cise. We thus rely on an automated construction of a 
classifier that finds the best combination of scales (i.e. 
all scales contribute to the final classification but with 
different weights) that maximizes the separability of two 
categories that the user has previously manually defined 
(i.e. samples of vegetation and samples of ground seg- 
mented from the point cloud). In the following we de- 
scribe the construction of the classifier and then present 
in section 5 typical classification results and step-by-step 
application to natural data sets. 

4.1 Probabilistic classifier in the plane of 
maximal separability 

The full feature space of dimension $2 N_s$ is now
considered in order to define a classifier that takes advantage
of working simultaneously on the data representation at
multiple scales. This classifier is defined in two steps: 1.
by projecting the data in a plane of maximal separability;
and 2. by separating the classes in that plane. The
main advantage of proceeding this way is to get an easy
supervision of the classification process. Visual inspection
of the classifier in the plane of maximal separability
is very intuitive, which in turn allows for an easy
improvement of the classifier if needed (e.g. changing the
separation line in Fig. 5 to make a non-linear classifier).^
The plane of maximal class separability is intuitively
like a PCA where only the 2 main components are kept,
except that it optimizes a class separability criterion
instead of maximizing the projected variance as the PCA
would do. In principle any classifier allowing a
projection on a subspace can be used in an iterative procedure


^ Human intervention at this point allows for a powerful pat- 
tern recognition beyond the capacities of the simple classifiers pre- 
sented here. Moreover some practical applications may require 
imbalanced accuracies for each class. For example one may prefer 
to increase the confidence in removing all the vegetation at the 
expense of losing a few data points of ground. Allowing easy
user intervention by means of a graphically tunable classifier in a 
2D plane of maximal separability nicely offers these two advan- 
tages: improved pattern recognition and adaptability. Automated 
processing is of course also possible and in fact forms the default 
classifier on which the user can intervene if so desired. 



(including non-linear classifiers with the kernel trick, see
[27]). In the present work two linear classifiers are
considered: Discriminant Analysis [3] and Support Vector
Machines [7]. The rationale is to assess the usefulness
of our new feature for discriminating classes of natural
objects. Comparing the results obtained with these two
widely used linear classifiers validates that the newly
introduced feature does not depend on a complex
statistical machinery to be useful. We stress that last point:
using one or the other of these classifiers has little impact
in practice (see the results in section 5.1), but we had
to demonstrate this is actually the case and that using
a simple linear classifier is good enough for our use.

Fig. 5 Classifier definition in the plane of maximal
separability.

Let $F = \{X = (x_1, y_1, x_2, y_2, \ldots, x_{N_s}, y_{N_s})\}$ be the
multiscale feature space of dimension $2 N_s$, with $(x_i, y_i)$
the coordinates within each triangle in Fig. 4. Consider
the sets of points $F^+$ and $F^-$ labeled respectively by $+1$
or $-1$ for the two classes to discriminate (e.g. vegetation
against ground). A linear classifier proposes one solution
in the form of a hyperplane of $F$ that best separates $F^+$
from $F^-$. That hyperplane is defined by $w^T X - b = 0$,
with $w$ a weight vector and $b$ a bias:

• Linear Discriminant Analysis proposes to set
$w = (\Sigma_1 + \Sigma_2)^{-1} (\mu_2 - \mu_1)$, where $\Sigma_c$ and $\mu_c$ are the
covariance matrix and the mean vector of the samples
in class $c$.

• Support Vector Machines set w so as to maximize 
the distance to the separating hyperplane for the 
nearest samples in each class. The Pegasos approach 
described in [39] [21] is used here to compute w since
it is adapted to cases with large number of samples 
while retaining a good accuracy. 
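
The Linear Discriminant Analysis direction given in the list above can be sketched with NumPy as follows. This is an illustration on synthetic data, not the authors' code. Two elements are our own assumptions and are not taken from the text: a tiny ridge term added for numerical stability, and a bias placed midway between the projected class means, used here only as a simple stand-in for the probabilistic bias estimation described in the paper.

```python
import numpy as np

def lda_direction(class1, class2, ridge=1e-9):
    """Weight vector w = (Sigma_1 + Sigma_2)^{-1} (mu_2 - mu_1)."""
    class1 = np.asarray(class1, dtype=float)
    class2 = np.asarray(class2, dtype=float)
    mu1, mu2 = class1.mean(axis=0), class2.mean(axis=0)
    sigma1 = np.cov(class1, rowvar=False)   # rows are samples
    sigma2 = np.cov(class2, rowvar=False)
    # The ridge term is an assumption of this sketch (guards against singularity).
    pooled = sigma1 + sigma2 + ridge * np.eye(class1.shape[1])
    return np.linalg.solve(pooled, mu2 - mu1)

def midpoint_bias(w, class1, class2):
    """Stand-in bias: midpoint of the projected class means (not the paper's method)."""
    return 0.5 * (np.mean(np.asarray(class1) @ w) + np.mean(np.asarray(class2) @ w))

# Synthetic 4-dimensional features (2 scales x 2 values), for illustration only.
rng = np.random.default_rng(0)
ground = rng.normal(0.0, 1.0, size=(200, 4))
vegetation = rng.normal(0.0, 1.0, size=(200, 4)) + np.array([4.0, 0.0, 0.0, 0.0])

w = lda_direction(ground, vegetation)
b = midpoint_bias(w, ground, vegetation)
# Balanced accuracy: ground should fall on the negative side, vegetation on the positive.
accuracy = 0.5 * (np.mean(ground @ w - b < 0) + np.mean(vegetation @ w - b > 0))
```

Because $w$ points from the mean of class 1 toward the mean of class 2, samples of class 2 project to larger values of $w^T X$ on average.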

In each case the bias b is defined using the approach 
described in [33], which gives a probabilistic interpre- 
tation of the classification: the distance d of a sample 
to the hyperplane corresponds to a classification confidence,
internally estimated by fitting the logistic function
$p(d) = 1 / (1 + \exp(-\alpha d))$.

The feature space $F$ is then projected on the
hyperplane using $w$ and $b$, and the distance to the hyperplane
$d_1 = w_1^T X - b_1$ is calculated for each point. The process
is repeated in order to get the second-best direction
orthogonal to the first, together with the second distance
$d_2$. The couple $(d_1, d_2)$ is then used as coordinates defining
the 2D plane of maximal separability. Since there is a
degree of freedom in choosing $w$, $b$ such that $w^T X - b = 0$,
both axes can be rescaled such that $\alpha = 1$, where $\alpha$ is
the parameter of the fitted logistic function. Thus the
coordinates $(d_1, d_2)$ in the separability plane are now
consistent in classification accuracy.^ This consistency
allows some post-processing in the plane. With the current
definition most classifiers would squash the data toward






^ To our knowledge this way of defining a 2D visualization in 
a plane of maximal separability, while retaining an interpretation 
of the scales in that plane using confidence values, is an original 
contribution of this work. 



Color is available online. Blue (dark gray): vegetation
samples. Red (light gray): soil. The classifier was obtained
automatically with a linear SVM using the process described
in Section 4.1 in order to classify the benchmark described
in Section 5.1. The confidence level is given for the
horizontal axis. The scaling for the Y axis has no impact on the
automated classification performance but offers a better
visualization, which is especially useful when the user wishes
to modify this file graphically.



the X = Y diagonal.* The post-processing consists in
rotating the plane so that the class centers are aligned
on X, and then scaling the Y axis so the classes have
the same variance on average in both directions. This
last step is completely neutral with respect to the
automated classifier that draws a line in the plane (the
optimal line could be defined whatever the last rotation
and scaling). However it is now much easier to visually
discern patterns within each class in the new rotated
and rescaled space, as can be seen in Fig. 5. That figure
shows an example of a classifier automatically obtained
using the data presented in Section 5.1. The given scale
of 95% classification confidence is valid along the X axis
and the corresponding factor for the Y axis is indicated.
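
To make the link between distance and confidence concrete, the following sketch assumes that the fitted logistic function has the usual form p(d) = 1 / (1 + exp(-alpha d)) and that the axis has been rescaled so that alpha = 1. Under these assumptions a 95% confidence corresponds to a distance of ln(0.95 / 0.05), about 2.94 units along the X axis. The function names are ours.

```python
import math

def confidence(distance, alpha=1.0):
    """Classification confidence for a signed distance to the hyperplane."""
    return 1.0 / (1.0 + math.exp(-alpha * distance))

def distance_for_confidence(p, alpha=1.0):
    """Inverse of the logistic function: distance giving confidence p."""
    return math.log(p / (1.0 - p)) / alpha

d95 = distance_for_confidence(0.95)   # ln(19), about 2.94
print(f"95% confidence at d = {d95:.3f}")
print(f"confidence on the boundary: {confidence(0.0):.2f}")
```

A point lying on the decision boundary (d = 0) gets a confidence of 0.5, and the confidence for the opposite class is obtained by changing the sign of d.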

4.2 Semi-supervised learning 

One goal in developing this classification method was 
to minimize user input (i.e. manually extracting and 
labeling data in the scene is cumbersome) while maximizing
the generalization ability of the classifier. This

* To see why, imagine the data being projected on the X axis 
with negative coordinates for class 1 and positive coordinates for 
class 2. The Y axis (second projection direction) also projects class 
1/2 to negative/positive coordinates. Hence the data is mostly 
concentrated along the diagonal. 



is achieved by semi-supervised learning: using the infor- 
mation which is present in the unlabeled points. The 
plane of maximal separability is necessarily computed 
only with the labeled examples. We search for a direction 
in this plane which minimizes the density of all points 
along that direction (labeled and unlabeled), while still 
separating the labeled examples. The assumption is that 
the classes form clusters in the projected space, so
minimizing the density of unlabeled points should find these
cluster boundaries. When no additional unlabeled data
are present the classes are separated simply with a line 
splitting both with equal probability. 
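
The idea of placing the boundary in a low-density zone while still separating the labeled examples can be illustrated with a simplified grid search. This is our own stand-in for the procedure described above, not the actual algorithm: candidate lines are enumerated by angle and offset, lines that misclassify a labeled example are discarded, and the line with the fewest points (labeled and unlabeled) inside a narrow band around it is kept. All names and parameter values are hypothetical.

```python
import math

def min_density_line(labeled, unlabeled, n_angles=36, n_offsets=41, band=0.25):
    """Find a line nx*x + ny*y = c separating labels with few points nearby.

    labeled: list of ((x, y), label) with label in {-1, +1}
    unlabeled: list of (x, y)
    Returns (nx, ny, c), or None if no candidate separates the labeled points.
    """
    all_points = [p for p, _ in labeled] + list(unlabeled)
    best, best_count = None, None
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        nx, ny = math.cos(theta), math.sin(theta)
        projections = [nx * x + ny * y for x, y in all_points]
        low, high = min(projections), max(projections)
        for j in range(n_offsets):
            c = low + (high - low) * j / (n_offsets - 1)
            # Keep only lines with every labeled point strictly on its own side.
            if any(lab * (nx * x + ny * y - c) <= 0 for (x, y), lab in labeled):
                continue
            # Density proxy: number of points within `band` of the line.
            count = sum(1 for s in projections if abs(s - c) <= band)
            if best_count is None or count < best_count:
                best, best_count = (nx, ny, c), count
    return best

# Two synthetic clusters in the plane of maximal separability.
labeled = [((-2.0, 0.0), -1), ((-1.5, 0.3), -1), ((2.0, 0.0), 1), ((1.5, -0.3), 1)]
unlabeled = [(-2.0 + 0.1 * i, 0.2 * i) for i in range(-3, 4)]
unlabeled += [(2.0 + 0.1 * i, 0.2 * i) for i in range(-3, 4)]
line = min_density_line(labeled, unlabeled)
```

With these two well separated clusters the selected line passes through the empty gap between them.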



For a multi-class scenario (see Section 5.2) the final
classifier is a combination of elementary binary classi- 
fiers. In that case it may be that some cluster in the 
unlabeled data corresponds to another class than the 
two being classified, which would fool the aforementioned 
density minimization. A workaround is to use only the 
labeled examples, or to rely on human visual recognition 
to separate the clusters manually. 

More generally the ability to visualize and keep control 
of the process (this is not a "black box" tool) makes it 
possible to tap into human perception to better separate classes. But 
the ability to fully automate the operations is retained, 
which is especially useful for large batch processing. The 
user can always review the classifier if needed. 

We developed a tool usable by non-specialists: the 
classifier is provided in the form of a simple graphics 
file that the user can edit with any generic, commonly 
available SVG editor. The decision boundary can be 
graphically modified, thus quickly defining a very 
powerful classifier with minimal user input. This step is fully 
optional and the default classifier can of course be taken 
without modification. 



4.3 Optimization 

The most time-consuming parts of the algorithm are 
computing the local neighborhoods in the point cloud at 
different scales in order to apply the local PCA transform 
(see Section 4.1), as well as the SVM training process 
(computing the Linear Discriminant Analysis is fast and 
not an issue, although even when using an SVM, training 
a classifier is only needed once per type of natural 
environment). We address these issues by computing the 
multiscale feature on a sub-sampling of the scene called 
core points. The whole scene data is still considered for 
the neighborhood geometrical characteristics, but that 
computation is performed only at the given core points. 

This is a natural way of proceeding for lidar data: 
given the inhomogeneous density there is little interest 
in computing the multiscale feature at each point in the 
densest zones. A spatially homogeneous density of core 
points is generally sufficient and allows an easier scene 
manipulation and visualization. However the extra data 
available in the densest zones is still used for the PCA 
operation, which results in increased precision compared 
to far away zones with less data points. We also pre- 
serve the local density information and the classification 
confidence around each core point as a measure of that 
precision. When classifying the whole scene, each scene 
point is then given the class of the nearest core point. 
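
As a minimal sketch of this last step, the snippet below gives each scene point the class of its nearest core point. It is an illustration under stated assumptions, not the software's implementation: the function name `classify_scene` and the toy coordinates are hypothetical, and the brute-force search would be replaced by a spatial index (for example a k-d tree) on scenes with millions of points.

```python
import math


def classify_scene(scene_points, core_points, core_labels):
    """Give each scene point the label of its nearest core point.

    Brute-force nearest-neighbor search, O(len(scene) * len(core)).
    """
    labels = []
    for p in scene_points:
        nearest = min(
            range(len(core_points)),
            key=lambda i: math.dist(p, core_points[i]),
        )
        labels.append(core_labels[nearest])
    return labels


# Two already classified core points and three scene points (toy values).
core_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
core_labels = ["ground", "vegetation"]
scene_points = [(0.1, 0.0, 0.0), (0.9, 0.1, 0.2), (0.4, 0.0, 0.0)]
scene_labels = classify_scene(scene_points, core_points, core_labels)
```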

As a result the user is offered a trade-off between 
computation time and spatial resolution: it is possible to call 
the algorithm on the whole scene (each point is a core 
point) or to call the algorithm on a sub-sampling of the 
user's choice (e.g., a homogeneous spatial density of core 
points). 

5 Results 

5.1 Quantitative benchmark on ground 
and riparian vegetation classification 

In order to quantitatively assess the performance of the 
classifier, examples were selected from the pioneer salt 
marsh scene (see Fig. H for an excerpt of this scene) in 
which two classes can be defined: riparian vegetation 
and ground. These examples represent various vegeta- 
tion patch sizes and shapes, shadow zones, flat ground, 
small ripples, data density changes and multiple scan- 
ner positioning. The data set comprises approximately 
640000 points, manually classified into 200000 belonging 
to vegetation and 440000 for ground. This data set is 
provided online together with the software (link given at 
the end of this paper) so it can be reused for comparative 
benchmarks. 

The classifier is trained to recognize vegetation from 
ground in the first set of examples, using about half of 
the aforementioned data. Its performance is then 
assessed on the remaining half of the data that was not 
used for training. This is not only the standard proce- 
dure in the machine learning field (to detect when the 
algorithm learns details of a particular data set that are 
not transposable to other data sets, i.e. the over-fitting 
issue), but also what is expected from our new tech- 
nique. We aim at an excellent generalization ability: the 
algorithm must be able to recognize the vegetation in 
unknown scenes, not just in the samples it was 
presented with. 

We use the balanced accuracy measure to quantify 
the performance of the classifier in order to account 
for the different number of points in each class. With 
$t_v$, $t_g$, $f_v$, $f_g$ the number of points truly ($t$) / falsely ($f$) 
classified into the vegetation ($v$) / ground ($g$) classes, the 
balanced accuracy is classically defined as 
$ba = \frac{1}{2}(a_v + a_g)$, with each class accuracy defined as 
$a_v = \frac{t_v}{t_v + f_g}$ and $a_g = \frac{t_g}{t_g + f_v}$. 
We use the Fisher Discriminant Ratio fdr [llj in order to 
assess the class separability. 



For example Inkscape, available at http://www.inkscape.org/ (as of 2012/01/19) 



* Both spatially homogeneous sub-sampling and scene 
manipulation are easy to perform with free software like CloudCompare. 

The performance of each classifier is measured using the 
Balanced Accuracy (ba) and the Fisher Discriminant Ratio (fdr). 
Both are described in the main text. 

Tab. 1: Quantitative benchmark for separating vegeta- 
tion from ground. 



The classifier assigns for each sample a signed distance d 
to the separation line, using negative values for one side 
and positive values for the other. The measure of 
separability is defined as $fdr = (\mu_2 - \mu_1)^2 / (v_1 + v_2)$ with 
$\mu_c$ and $v_c$ the mean and variance of the signed distance 
d for each class c. Note that the class separability could 
still be high despite a mediocre accuracy (e.g., separation 
line positioned on a single side from both classes). 
This would merely indicate a bad training with potential 
for a better separation. Hence both ba and fdr are 
useful measures for assessing separately the role of the 
classifier and the role of the newly introduced feature in 
the final classification result. A large ba value indicates a 
good recognition rate (ba = 50% indicates random class 
assignment) on the given data set, and a large fdr value 
indicates that classes are well separated (an indication 
that the ba score is robust). 
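
The two measures can be computed in a few lines. The sketch below is illustrative only: the function names and the toy counts are hypothetical, the fdr is assumed to take the usual squared-difference form of the Fisher ratio, and the population variance is used (the choice between population and sample variance is an assumption, and matters little for large classes).

```python
from statistics import mean, pvariance


def balanced_accuracy(tv, tg, fv, fg):
    """Mean of the per-class accuracies.

    tv, tg: points truly classified as vegetation / ground.
    fv, fg: points falsely classified as vegetation / ground.
    """
    a_v = tv / (tv + fg)  # vegetation points are either tv or fg
    a_g = tg / (tg + fv)  # ground points are either tg or fv
    return 0.5 * (a_v + a_g)


def fisher_discriminant_ratio(d1, d2):
    """Squared difference of class means over the sum of class variances.

    d1, d2: signed distances to the separation line for each class.
    """
    return (mean(d2) - mean(d1)) ** 2 / (pvariance(d1) + pvariance(d2))


# Toy example: 200 vegetation points and 440 ground points.
ba = balanced_accuracy(tv=190, tg=420, fv=20, fg=10)
fdr = fisher_discriminant_ratio([-3.0, -2.0, -1.0], [1.0, 2.0, 3.0])
```

Because each class accuracy is normalized by its own class size, the ba is not dominated by the more numerous ground points.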

Table 1 shows the results of the benchmark. The 
classifier that was used is fully automated, without human 
intervention on the decision boundary, and takes 19 
scales between 2 cm and 20 cm, one every cm (larger scales do 
not improve the classification, see Fig. 5 for the typical 
vegetation size ≈ 40 cm). We used our software default 
quality / computation time trade-off for the support vec- 
tor machine classifier training in order to adequately 
assess the results of our algorithm in usual conditions. 
The algorithm was forced to classify each point, while 
in practice the user may decide to ignore points with- 
out enough confidence in the classification (see Section 
5.2). Nevertheless the balanced accuracy that was ob- 
tained both on the training set and the testing set is very 
good. This not only shows that the algorithm is able to 
recover the manually selected vegetation/soil (train set 
accuracy) but that it is able to generalize to terrain data 
it had not seen before. This is of great importance for 
large campaigns: we can train the algorithm once on a 
given type of data and then apply the classifiers to a 
large quantity of further measurements without having 
to re-train the algorithm. 

Table 2 shows the result of the classification using 
single scales only. The advantage of using a multi-scale 
classifier is apparent: it offers a better accuracy than any 
single scale alone. The difference is more pronounced 
for the discriminative power, with the multi-scale classi- 
fier offering almost twice as much class separability. Al- 
though this is the expected behavior, some classifiers are 
sensitive to noise and adding scales with no information 
would potentially decrease the multi-scale performance. 
The scales from 2 cm to 20 cm not shown in Table 2 have 
similar properties and performance levels, with slightly 
better results for single scales between 5 and 10 cm. Even 
with this observed performance peak there is no charac- 
teristic scale in this system as discriminative information 
is present at all scales: the point of the multi-scale clas- 
sifier is precisely to exploit that information. 

In this example, both classifiers (LDA and SVM) give 
the same results at each scale, and are equally suitable 
for the multiscale feature (Table 1). In other scenarios 
the situation might be different, but overall this confirms 
that our method does not need a complicated statistical 
machinery (like the SVM) to be effective, and that using a 
simple linear classifier (like the LDA) is good enough. 
In any case we achieve at least 97.5% classification ac- 
curacy. 

Figure 6 visually shows the result of the classification 
on a subset of the test data using the multi-scale 
SVM classifier obtained with the fully automated 
procedure. Points with a low classification confidence are 
highlighted in blue. They correspond mostly to the 
boundary between ground and vegetation. Figure 6 
shows that the algorithm copes very well with the 
irregular density of points, the shadow zones and the ripples. 
The actual classifier definition is shown in Fig. 5. 



5.2 3D multiscale classifiers with multiple 
classes 

5.2.1 Dealing with multiple classes 

Combining multiple binary classifiers into a single one 
for handling multiple classes is a longstanding problem in 
machine learning |3T] . Typically the problem is handled 
by training "one against one" (or "one against others") 
elementary binary classifiers, which are then combined 
by a majority rule. This is what the automated tool suite 
CANUPO proposes in the present context, following the 
common practice in the domain. 
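
The majority rule itself is simple to state in code. The sketch below is a generic illustration, not CANUPO's implementation: the three toy "one against one" classifiers, their thresholds and the scalar feature are hypothetical stand-ins for the trained binary classifiers.

```python
from collections import Counter


def majority_vote(binary_classifiers, feature):
    """Combine binary classifiers: each one votes for a class label and
    the most voted label wins (ties are resolved arbitrarily here)."""
    votes = Counter(clf(feature) for clf in binary_classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical "one against one" classifiers on a scalar feature.
def veg_vs_rock(x):
    return "vegetation" if x > 0.5 else "rock"


def veg_vs_gravel(x):
    return "vegetation" if x > 0.6 else "gravel"


def rock_vs_gravel(x):
    return "rock" if x < 0.2 else "gravel"


classifiers = [veg_vs_rock, veg_vs_gravel, rock_vs_gravel]
labels = [majority_vote(classifiers, x) for x in (0.8, 0.1, 0.55)]
```

A plain majority rule treats all classes symmetrically and gives no control over how ties or conflicting votes are resolved.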

Additional extensions are of course possible in 
future work. Recent developments on advanced 
statistical techniques [II] deal with the issue of training and 
then combining the elementary classifiers. However in 
the present context we wish to retain a possible interven- 
tion on the classifiers using a graphical editor. Moreover 
context-dependent choices like favoring one class over the 
other need to be allowed. It may thus be more efficient 
to separate classes one by one and combine the results, 
as is explained in the next section. 
The results for both classifiers differ only at the fourth digit for the Balanced Accuracy (ba) and at the third for the Fisher 
Discriminant Ratio (fdr), so the tables were merged. 



Tab. 2: Single scale benchmark results at selected scales 



Fig. 6 Excerpt of the quantitative benchmark test set classification 




Color is available online. White: points recognized as ground. Green (light gray): points recognized as vegetation. Blue (dark 
gray): points for which the confidence in the classification is less than 80%. Scale is in meters. 



5.2.2 Application to a complex natural environment 

In the following we illustrate the capabilities of the 
method in classifying complex 3D natural scenes. A sub- 
set of the Otira River scene (fig. fl| was chosen, and four 
main classes defined: vegetation, bedrock surface (on the 
channel bank and large blocs), gravel surfaces and water. 
Figure 7 presents the dimensionality density diagrams of 
one training patch for each class and scales ranging from 
5 to 50 cm. As intuitively expected, vegetation is mostly 
1D and 2D at small scale (leaves, stem) and becomes 
dominantly 3D at scales larger than 15 cm. However, 
the clustering of points is only significant at scales larger 
than 20 cm. Bedrock surfaces are mostly 2D at all scales, 
with some 1D-2D features occurring at fine scales corre- 
sponding to fractures. Gravel surfaces exhibit a larger 
scatter at all scales owing to the large heterogeneity in 
grain sizes. The 3D component is more important at 
intermediate scales (10 to 20 cm) than at small or very 
large scales. This illustrates the transition from a scale of 
analysis smaller than the dominant gravel size (i.e., 
gravels appear as dominantly 2D curved surfaces), and then 
larger than an assemblage of gravels (i.e., gravel 
roughness disappears). As explained in Section 2, the whitewater 
surface can be picked up by the laser, whereas in general it 
does not reflect on clear water [5S]. Yet, even at small 
scale the water does not appear purely 2D as the water 
surface is uneven and the laser penetrates at different 
depth in the bubbly water surface. Indeed, the signa- 
ture is quite multidimensional for scales up to 20 cm, 
and only around 20 cm does the water surface appear to 
significantly cluster along a 2D-3D dimensionality. At 
larger scale, the water becomes significantly 2D. 

The multi-scale properties of the various classes show 
that there is not a single scale at which the classes could 
be distinguished by their dimensionality. Vegetation and 
bedrock are quite distinct at large scale, but bedrock, 
gravel and water are too similar at this scale to be labeled 
with a high level of confidence. Only at smaller scales 
(10-20 cm) can bedrock be distinguished from gravel and 
water. This visualization also shows that gravel and wa- 
ter will be difficult to distinguish owing to their very 
similar dimensionality across all the scales. 

In the following, approximately 5000 core points for 
each class were selected for the training process. Their 
multiscale characteristics were estimated using the com- 
plete scene rather than excerpts of the class only. Points 
in the original scene have a minimal spacing of 1 cm 
corresponding to ~ 1.17 million points. The actual 
classification operates on a subset of 330000 core points with 
a minimum spacing of 2 cm. The multi-class labeling 
was achieved using a series of 3 binary classifiers (fig. 8), 
all using the same set of 22 scales (from 2 cm to 1 m). 
An automated classification (i.e., the only user interac- 
tion was in defining the classes and the initial training 
sets) is presented, as well as examples of possible user 
alterations. These alterations are of three types: 
changing the initial training sets, modifying the classifier, and 
defining a classification confidence interval. Given that 
user improvements depend on specific scientific 
objectives (e.g., documenting vegetation, characterizing grain 
sizes or measuring surface change), they cannot all be 
discussed completely here. We present a case in which 
the classification of bedrock surfaces was slightly opti- 
mized. The LDA approach was used for all classifier 
definitions as the results were on par or slightly better 
than an SVM approach. Figure 9 presents the results for 
the original data, the automated and the user-improved 
classification results. 

The first classifier separates vegetation from the three 
other classes. The automated training procedure results 
in a ba of 99.66 % approaching perfect identification of 
vegetation on the training sets. The very high level of 
separability is reflected by a large fdr value (11.67) and 
a very small classification uncertainty in the projected 
space (fig. 8). As shown in figure M the automated 
classification of vegetation is excellent with very lim- 
ited false positives appearing in overhanging parts of 
large blocs where the local geometry exhibits a dimen- 
sionality across various scales too similar to vegetation. 
The precision of the labeling is also excellent as small 
parts of bedrock between or behind vegetation are cor- 
rectly identified, and small shrubs are correctly isolated 
amongst bedrock surfaces. Nevertheless it is still possible 
to improve this classifier by using the incorrectly classi- 
fied overhanging blocs in the training process (5000 core 
points were added). This 5-minute operation results in 
better handling of false vegetation positives, and retains 
excellent characteristics on the original training sets (ba 
= 98.2 %, fdr = 9.89). A classification confidence 
interval of 90 % was also set visually in the CloudCompare 
software |12| by displaying the uncertainty level of each 
core point and defining the optimum between quality 
and coverage of the classification. This left 5.7 % 
of the original scene points unlabeled. 

Classifier 2 separates bedrock surfaces from water and 
gravel surfaces (fig. 8). The automated training 
procedure led to a ba of 95.7 % and fdr of 6.21 on the original 
training sets. Because gravels exhibit a wide range of 
scales from pebbles to boulders, it is not possible to fully 
separate the bedrock and gravel classes as the largest 
gravels tend to be defined as rock surfaces. Fractures and 
sharp edges of blocs tend to be classified as non-bedrock 
as they are 3D features at small scale and 2D at large 
scale (as is gravel). Yet, as in the previous case, the 
confidence level defined at 0.95 remains small compared 
to the size of the two clusters in the projected space. 
While the original classifier was already quite 
satisfactory, it was tuned to primarily isolate rock surfaces by 
manually changing the classifier position in the 
hyperplane projection (ba = 92.3 %, fdr = 6.31, fig. 8). A 
classification confidence interval of 80 % was also used 
which left 17 % of the remaining points unlabeled (fig. 

Classifier 3 separates water from gravel surfaces 
(classifier 3, fig. 8). The automated training procedure led 
to a ba of 83.2 %. As expected from the similarity of 
Fig. 7 Multiscale, multi-class scenario 

Left: excerpt from the point cloud of fig. [T]. Right: dimensionality diagrams for the four main classes of a mountain river 
environment at scales ranging from 5 to 50 cm. Colors from blue to red correspond to the density of points from the training 
dataset and characterize the degree of clustering around a given dimensionality. 



Fig. 8 Classification process and classifiers for the multiscale case 
Semi-supervised classifiers for the Otira river dataset and classification procedure diagram. For each classifier, the gray area 
indicates the portion of space in which the classification confidence is lower than 0.95. For Classifier 2, the manually modified 
supervised classifier is targeted to preserve more systematically bedrock surfaces at the expense of non-bedrock surfaces. 
In the classification procedure diagram, percents indicate the proportion of each class in the supervised classification which 
uses confidence intervals of 0.9 for Classifier 1, 0.8 for Classifier 2 and Classifier 3. 
the dimensionality density diagrams (fig. 7), the two 
classes are more difficult to separate than in the 
previous two classifications and the confidence level defined 
at 0.95 overlaps significantly the two classes. Yet, 
figure 9 shows that the classifier manages to correctly 
label the whitewater and gravel surfaces corresponding to 
non-trained datasets. Being quite effective, the default 
classifier was not altered. A confidence interval of 80 % 
was used resulting in 78 % of the remaining points being 
labeled. 

The end result of this process is a 3D scene (fig. 9). 
As shown in fig. 9, the default parameters already give 
an excellent first order classification. The fine-tuning 
previously described does not represent the best 
classification possible, but rather an example of how the 
automated approach can be rapidly tweaked to give some 
improvement. On a practical note, the simplest way to 
improve the default classifier in this example is to add in 
the training process some of the false positive and false 
negative results of a first training, rather than manually 
altering the classifier. Defining a confidence level during 
the classification process is very useful as the amount of 
data is so large that labeling only 70 % of the points 
is not detrimental to the interpretation of the results. 
The classes can be further cleaned by removing isolated 
points using the volumetric density of data calculated 
during the multi-scale analysis. 
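
The trade-off between confidence and coverage can be sketched as follows. This is an illustration under stated assumptions: the function name `apply_confidence`, the toy labels and the confidence values are hypothetical, and how the per-point confidence is derived from the classifier is not shown.

```python
def apply_confidence(labels, confidences, threshold):
    """Keep a label only where its confidence reaches the threshold.

    Returns the filtered labels (None means "left unlabeled") and the
    coverage, i.e. the fraction of points that keep a label.
    """
    kept = [
        label if conf >= threshold else None
        for label, conf in zip(labels, confidences)
    ]
    coverage = sum(k is not None for k in kept) / len(kept)
    return kept, coverage


# Toy example with five core points and a confidence level of 0.8.
labels = ["gravel", "water", "gravel", "water", "gravel"]
confidences = [0.99, 0.85, 0.60, 0.95, 0.70]
kept, coverage = apply_confidence(labels, confidences, threshold=0.8)
```

Raising the threshold lowers the coverage but removes the least reliable labels first, which is the balance the user tunes.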

5.3 Single scale vs Multiple scale 
classification 

Table 3 presents the balanced accuracy of the three 
classifiers used in the Otira river scene (fig. 8) trained with 
the same subsets but using a single spatial scale (5, 10, 
20, 50, 75 or 100 cm). For each classifier, the balanced 
accuracy of the multiple scale classification is 
systematically better than the single scale ones. The 
improvement is very significant for Classifier 3 (83.2 % vs 70.9 
% for single scale). Most importantly, the separability 
of classes (as measured by the fdr) is always increased 
at least two to three times for Classifiers 2 and 3 (and 
by 40 % for Classifier 1). The increased separability is 
the key advantage of the multi-scale approach. It allows 
a larger geometrical inhomogeneity within a given class, 
and a better generalization of the results than a single 
scale approach. 

Compared to a single scale classification at 1 m, the 
improvement of the multi-scale Classifier 1 (vegetation vs 
not vegetation) seems more marginal. However, by com- 
paring the classification results on the Otira river data, 
the multi-scale classifier is more precise than the single- 
scale case: small shrubs within bedrock, that are not 
correctly classified by the single large scale classifier, are 
correctly retrieved with the multi-scale approach. Sim- 
ilarly, incorrectly classified zones below blocs for both 
classifiers are more extended with the single-scale 
classifier. Therefore the multi-scale classification is 
qualitatively improved, which is not reflected by the 
quantitative ba measure alone. 

Tab. 3: Comparison of Balanced Accuracy on Single 
scale and Multi-scale classifier 

Results of semi-supervised LDA classification for single and 
multiple scales using the original training sets of the Otira 
river classifiers (fig. 8). Results are similar using an SVM 
approach. 

We conclude that the multi-scale analysis always 
significantly improves the classification compared to a 
single scale analysis on one or all of the following aspects: 
discrimination capacity, separability of the classes and 
spatial resolution. 



6 Discussion / openings 

While many studies have focused on the classification 
of ground vs vegetation, or buildings in 2.5D airborne 
lidar data using purely geometrical approaches (e.g., 
|421l51[5]l. none can really apply to dense 3D point clouds 
obtained from ground based (fixed or mobile) lidar data 
in which a fully 3D approach is needed. Such 3D 
approaches have been pursued using a dimensionality 
measurement at a given scale |45l E5] to detect 1D (e.g. tree 
trunks or cables), 2D (ground) and 3D (vegetation) 
objects. However, because natural surfaces exhibit a large 
range of characteristic scales and natural objects within 
a given class can have a large degree of geometric hetero- 
geneity (i.e. vegetation or sediment), a single scale can 
rarely classify an entire scene robustly. We have thus 
introduced a multi-scale analysis of the local geometry 
of point clouds to cope with the aforementioned issues, 
which exhibits good performance even with simple linear 
classifiers. By doing so the selection of a specific or char- 
acteristic scale is not needed. We have shown that the 
combination of scales systematically improves the sep- 
arability of classes compared to a single scale analysis 
(Table 3, Table 1 vs Table 2), sometimes dramatically. 
The multi-scale analysis also allows retention of a de- 
tailed spatial resolution compared to a single large scale 
analysis. To further account for the geometrical com- 
plexity of natural environments, the user is free to set a 
level of confidence in the classification process that will 
control the balance between confidence and coverage of 
the classification. Given the large amount of data avail- 
able in TLS, it is often more interesting to not classify 
Fig. 9 Result of the classification process for the mountain river dataset 

A: Original Otira River scene (minimum point spacing = 1 cm). B: Default classification (green: vegetation, gray: bedrock, 
red: gravel, blue: water) according to the process described in fig. 8. C: User-improved classification. D: Unlabeled points 
(28.2 % of the total points). 

Fig. 10 Comparison of best single scale vegetation 
classifier with multi-scale classifier 

Panels: Single-Scale Classifier (1 m); Multi-Scale Classifier (0.02-1 m). 
Annotations: incorrect vegetation / low resolution error. 

Classification results for vegetation detection using a multi 
or single scale (1 m) classifier. Even though the balanced 
accuracy is similar on the training set, the multi-scale 
classification is much more precise and less prone to errors when 
generalized to the whole scene. The single scale classifier 
misses small shrubs on the bedrock and incorrectly classifies 
large bloc borders as vegetation. 



30 % of the data, in order to keep 70 % for which classes 
are correctly attributed. 

Because all scales contribute to a varying degree to the 
classification process, the method is relatively robust to 
shadow effects, missing data and irregular point density 
(e.g. fig. 9b and 10): even if the dimensionality cannot be 
characterized over a certain range of scales (e.g. small 
ones because of low point density, large ones because of 
shadow effects), other scales are used to classify a point, 
albeit with a smaller degree of confidence. Interestingly, 
qualitative inspection of the classification results shows 
that obvious spurious mixed points created at the edge 
of objects tend to be classified with a low level of confi- 
dence (provided that the scene has a relatively high point 
density and that the small scale dimensionality 
significantly contributes to the classification). This is explained 
by the low point density around mixed points (because 



no real object exists at their location) and the resulting 
lack of a good dimensionality characterization at small 
scale. Although systematically quantifying this effect is 
out of the scope of this work, this observation suggests 
that using a relatively high level of confidence during 
the classification process helps in cleaning the resulting 
classes from spurious mixed-points. 

We chose the dimensionality of the point cloud at a 
given scale as a continuous measure of the local scene 
geometry. This is an intuitive perception of the 
surface that can capture many aspects of natural 
geometries (|45l E5]), in particular the dichotomy between 3D 
vegetation and 2D surfaces. However, the multi-scale 
classification could also use other geometrical measures 
depending on the final objectives of the classification. 
Surface orientation, curvature, mathematical derivatives 
of a local surface [6_ or the degree of conformity to a 
given geometry (sphere, cylinder ....e.g., [4&,) could also 
have been used in the construction of the classifier. The 
surface angle with the vertical is already implemented 
in the classifier but was not used here. It could be used 
to separate channel banks from river bed for instance, 
or as an additional constraint to discriminate the wa- 
ter surface from the gravels (which are rarely completely 
horizontal compared to water). We note that for vege- 
tation classification, the dimensionality measurement is 
particularly effective and simple to define for 3D point 
clouds. Indeed, even at a single well chosen scale, the 
dimensionality criterion already performs well in detecting 
vegetation (i.e., fig. 10). 



Each point captured by lidar (airborne or terrestrial) 
also comes with a measure of the reflected laser inten- 
sity and in some cases with optical imagery information 
(RGB) that could be used in the classification process. 
Using the reflected laser intensity for classification 
purposes has been attempted for airborne (e.g. |14||T5] ) and 
ground-based lidar till [311 HI [H [TO] . In this latter case, 
the difficulties are numerous as the reflected intensity is 
a complex function of distance from the scanner, 
incidence angle, and surface reflectance [TOl ^^ (which, on 
top of the physico-chemical characteristics of the 
material itself, also depends on surface humidity and 
micro-roughness |lll \ST[ [52]). In simple cases for which the 
distance and the incidence angle are not greatly chang- 
ing (cliff survey for instance), the laser intensity can 
be used to distinguish between materials relatively ef- 
ficiently [TT]. It can also improve the robustness of a 
classifier based on simple geometrical parameters ([13]) 
in the case of simple natural environments such as ripar- 
ian vegetation over sand. However, for complex scenes 
such as the Otira river, lidar intensity is more difficult to 
use given the large changes in distances, incidence angles 
and state of the surface (wet vs dry surfaces). More- 
over, no standard exists between scanner manufacturers 
such that even if surface reflectance could be isolated 
from other effects, it would not necessarily be compara- 
ble between various scanner measurements as opposed 
to purely geometrical descriptors that can be used for 
any source of data (i.e. classifiers can be exchanged 
between users independently of the scanner used to acquire 
the data). Because laser reflected intensity is neither 
globally nor temporally consistent on a complex 3D scene, 
we conclude that it cannot yet be used as a primary 
classifier of complex 3D natural scenes. Development of 
precise geometrical correction factors for reflected 
intensity (e.g., [12]) may allow its future inclusion in the 
process of classifier training and subsequent classification 
to improve the resolution and accuracy of the geometri- 
cal classification. Provision for this is already included 
in the software. 

In the case of airborne lidar, the combination of 
geometrical information and imagery can significantly 
increase classification quality (e.g. [ID]). However, RGB 
imagery has rarely been used in the context of 3D 
terrestrial lidar classification [24]. The main reason is that 
it is much more difficult to have a spatially consistent 
RGB imagery of a 3D complex scene from the ground 
than from air. Indeed, the more 3D and complex an 
outdoor scene is, the more difficult it is to get a 
consistent exposure from the ground and from different points 
of view. For instance, in the case of the Otira River, 
the extent of shadows is pronounced owing to the nar- 
row gorge configuration and to the presence of vegeta- 
tion, and variable during one day. Wetted surfaces which 
are common in fluvial environments also have a different 
spectral signature than dry surfaces. Also, RGB imagery 
cannot distinguish first or last laser reflections in the case 
of the new generation of full-waveform multi-echo scan- 
ners. However, in the context of the riparian vegetation 
case example (fig. p|, and owing to the strong differ- 
ence in spectral characteristics of the vegetation and the 
sandy bed, good success could be expected using RGB 
classification [53]. But this requires the imagery to be 
taken without strong shadows, and in the case of the Le- 
ica Scanstation 2 would be limited by the low resolution 
of the on-board camera and the lack of precise registra- 
tion with the point cloud (typically a few cm difference 
at 50 m). Classifying flat versus rippled zones would still
require a geometrical analysis. 

Because it works in 3D, our method can be used on 
2.5D airborne lidar or mostly 2D point clouds ([13], [17],
[28], [12]). As shown with the riparian vegetation example,
it allows a direct extraction of vegetation from the
raw data without the need to construct a raster DEM.
Because the point density and the range of scales
available to characterize trees are smaller in full-waveform
aerial lidar than in ground-based lidar, it is not
certain that the method will define the ground
significantly better than existing methods for aerial data.
However, it should perform well as a generic geometric 
classifier of surfaces. Although not used in this study, 
the multiscale calculation also records the vertical angle 
of the local surface at the largest scale. This could be
used as an additional constraint to distinguish buildings
from roads, for instance.
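
As an illustration of how such a vertical angle could be derived, the following is a minimal sketch and not the code of the software itself. It assumes that the local surface normal is taken as the direction of least variance of the neighbours found at the largest scale, and that the angle is measured between this normal and the z axis (0 degrees for a horizontal surface such as a road, 90 degrees for a vertical surface such as a wall). The function name `surface_vertical_angle` is hypothetical.

```python
import numpy as np

def surface_vertical_angle(neighbours):
    """Angle in degrees between the local surface normal and the z axis.

    Assumption: `neighbours` is an (N, 3) array of the points found
    around the point of interest at the largest scale, with N >= 3.
    """
    pts = np.asarray(neighbours, dtype=float)
    cov = np.cov(pts, rowvar=False)            # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    normal = eigvecs[:, 0]                     # direction of least variance
    cos_angle = abs(normal[2]) / np.linalg.norm(normal)
    return float(np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0))))
```

With this convention, a threshold on the angle would be one possible extra feature for telling near-horizontal from near-vertical surfaces; the choice of threshold is left open here.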



7 Conclusion 

We introduced a new method for classifying 3D point 
clouds, called multi-scale dimensionality analysis, to 
characterize features according to their geometry. We 
demonstrated the applicability of this method in two 
contexts: 1. separating riparian vegetation from the 
ground in the Mt St Michel bay, and 2. recognizing 
rocks, vegetation, water and gravels in a steep moun- 
tain river bed. In each case the classification was per- 
formed with very good accuracy. The method is robust 
to missing data and changes in resolution common in 
ground-based lidar data. By combining various scales, 
the method systematically performs better than a single 
scale analysis and improves the spatial resolution of the 
classification. 
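
The idea of a multi-scale dimensionality measure can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation distributed with the software: the local dimensionality is summarized here by the proportions of variance carried by the three principal axes of the neighbours inside a sphere, and the neighbour search is a brute-force distance test. The function name `dimensionality_features` and its parameters are hypothetical.

```python
import numpy as np

def dimensionality_features(points, center, diameters):
    """Proportions of variance along the three principal axes, per scale.

    For each sphere diameter, the neighbours of `center` are selected and
    a principal component analysis is applied to them. A row close to
    (1, 0, 0) indicates a 1D arrangement, (1/2, 1/2, 0) a 2D one and
    (1/3, 1/3, 1/3) a fully 3D one.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    dist = np.linalg.norm(points - center, axis=1)
    features = []
    for d in diameters:
        neighbours = points[dist <= d / 2.0]
        if len(neighbours) < 3:
            # Too few points at this scale (e.g. shadow zone): no estimate.
            features.append((np.nan, np.nan, np.nan))
            continue
        cov = np.cov(neighbours, rowvar=False)
        eigvals = np.linalg.eigvalsh(cov)[::-1]   # descending order
        eigvals = np.clip(eigvals, 0.0, None)     # remove round-off negatives
        total = eigvals.sum()
        if total == 0.0:
            features.append((np.nan, np.nan, np.nan))
            continue
        features.append(tuple(eigvals / total))
    return np.array(features)
```

Stacking the rows obtained at several diameters gives one feature vector per point, which a classifier can then use to separate classes whose geometry differs across scales.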

Multi-scale dimensionality analysis proves particularly
efficient at separating vegetation from the rest
of the data. Removing (or studying) vegetation is a
common issue in natural environment studies, and this
method will be useful in this context given that it
can work directly on the raw data. Typical applications
include bare ground detection to study
sedimentation/erosion patterns in fluvial environments [48], rock
face analysis, where vegetation can grow and create
unwanted noise ([22], [37]), and analysis of vegetation
patterns and their relation to hydro-sedimentary processes.

We paid particular attention to providing tools usable
by non-specialists of machine learning, while retaining
the ability to process large batches of data automatically.
This tool set is available as Free/Libre software on
the first author's home page. Because it relies only on
geometrical properties, classifier parameter files can be
exchanged between users and applied to any geometrical
data without repeating the process of training the
classifier.